
    On the use of reproducing kernel Hilbert spaces in functional classification

    The Hájek-Feldman dichotomy establishes that two Gaussian measures are either mutually absolutely continuous (and hence there is a Radon-Nikodym density of each measure with respect to the other) or mutually singular. Unlike the case of finite-dimensional Gaussian measures, there are non-trivial examples of both situations when dealing with Gaussian stochastic processes. This paper provides: (a) Explicit expressions for the optimal (Bayes) rule and the minimal classification error probability in several relevant problems of supervised binary classification of mutually absolutely continuous Gaussian processes; the approach relies on some classical results in the theory of Reproducing Kernel Hilbert Spaces (RKHS). (b) An interpretation, in terms of mutual singularity, of the "near perfect classification" phenomenon described by Delaigle and Hall (2012); we show that the asymptotically optimal rule proposed by these authors can be identified with the sequence of optimal rules for an approximating sequence of classification problems in the absolutely continuous case. (c) A new model-based method for variable selection in binary classification problems, which arises in a very natural way from the explicit knowledge of the Radon-Nikodym derivatives and the underlying RKHS structure. Different classifiers can then be built from the selected variables; in particular, the classical, linear finite-dimensional Fisher rule turns out to be consistent under some standard conditions on the underlying functional model.
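    As a rough finite-dimensional illustration of such an optimal rule (not the paper's construction), the sketch below discretizes two Gaussian processes sharing a Brownian-motion covariance but with different means, and classifies trajectories with the resulting linear discriminant. The grid, means, and covariance are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0.02, 1.0, 50)            # discretization grid (illustrative)
K = np.minimum.outer(t, t)                # Brownian-motion covariance min(s, t)
m0 = np.zeros_like(t)                     # class-0 mean function (hypothetical)
m1 = 0.8 * t                              # class-1 mean function (hypothetical)

# Finite-dimensional analogue of the optimal (Bayes) rule when both classes
# share the covariance K: a linear discriminant built from K^{-1}(m1 - m0).
w = np.linalg.solve(K, m1 - m0)
thresh = w @ (m0 + m1) / 2.0

def classify(x):
    """Assign a discretized trajectory x to class 0 or 1."""
    return int(w @ x > thresh)

# Sample one trajectory from each class; classify() can then be applied.
L = np.linalg.cholesky(K + 1e-10 * np.eye(len(t)))
x0 = m0 + L @ rng.standard_normal(len(t))
x1 = m1 + L @ rng.standard_normal(len(t))
```

    The mean functions themselves are always classified correctly by this rule, since the discriminant evaluated at each mean differs from the threshold by half the (positive) Mahalanobis-type distance between the means.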

    On the maximum bias functions of MM-estimates and constrained M-estimates of regression

    We derive the maximum bias functions of the MM-estimates and the constrained M-estimates (CM-estimates) of regression and compare them to the maximum bias functions of the S-estimates and the τ-estimates of regression. In these comparisons, the CM-estimates tend to exhibit the most favorable bias-robustness properties. Also, under the Gaussian model, it is shown how one can construct a CM-estimate which has a smaller maximum bias function than a given S-estimate; that is, the resulting CM-estimate dominates the S-estimate in terms of maxbias and, at the same time, is considerably more efficient. Comment: Published at http://dx.doi.org/10.1214/009053606000000975 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org).
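    The notion of a maximum bias function can be illustrated with a much simpler estimate than those studied here: the median of a location model. Under ε-contamination of the standard normal, its worst-case asymptotic bias has the classical closed form Φ⁻¹(0.5/(1−ε)), attained by a point mass far in the tail. A minimal numerical sketch (illustrative, not one of the paper's regression estimates):

```python
import numpy as np
from statistics import NormalDist

def median_maxbias(eps):
    """Classical maximum asymptotic bias of the median under
    eps-contamination of the standard normal: Phi^{-1}(0.5 / (1 - eps))."""
    return NormalDist().inv_cdf(0.5 / (1.0 - eps))

# Monte Carlo check: a point mass far in the right tail drags the sample
# median up to (approximately) this maximum bias.
rng = np.random.default_rng(1)
eps, n = 0.1, 200_000
x = np.concatenate([rng.standard_normal(int((1 - eps) * n)),
                    np.full(int(eps * n), 1e6)])
empirical_bias = np.median(x)
```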

    On a general definition of the functional linear model

    A general formulation of the linear model with functional (random) explanatory variable X = X(t), t ∈ T, and scalar response Y is proposed. It includes the standard functional linear model, based on the inner product in the space L^2[0,1], as a particular case. It also includes all models in which Y is assumed to be (up to an additive noise) a linear combination of a finite or countable collection of marginal variables X(t_j), with t_j ∈ T, or a linear combination of a finite number of linear projections of X. This general formulation can be interpreted in terms of the reproducing kernel Hilbert space (RKHS) generated by the covariance function of the process X(t). Some consistency results are proved. A few experimental results are given in order to show the practical interest of considering, in a unified framework, linear models based on a finite number of marginals X(t_j) of the process X(t).
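    A minimal sketch of the finite-marginal case (the process, grid, marginal points, and coefficients below are illustrative assumptions, not taken from the paper): a scalar response built from two marginals of a simulated trajectory, recovered by ordinary least squares on those marginals.

```python
import numpy as np

rng = np.random.default_rng(2)
t = np.linspace(0.0, 1.0, 100)
n, dt = 500, t[1] - t[0]

# Simulated Brownian trajectories (hypothetical choice of process X).
X = np.cumsum(rng.standard_normal((n, len(t))) * np.sqrt(dt), axis=1)

# "Finite marginals" case: Y depends on X only through X(t_j1), X(t_j2).
j1, j2 = 30, 80
beta = np.array([2.0, -1.0])
Y = X[:, [j1, j2]] @ beta + 0.1 * rng.standard_normal(n)

# Least squares on the two selected marginals recovers beta.
beta_hat, *_ = np.linalg.lstsq(X[:, [j1, j2]], Y, rcond=None)
```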

    A geometrically motivated parametric model in manifold estimation,

    The general aim of manifold estimation is reconstructing, by statistical methods, an m-dimensional compact manifold S in R^d (with m ≤ d), or estimating some relevant quantities related to the geometric properties of S. We will assume that the sample data are given by the distances to the (d−1)-dimensional manifold S from points randomly chosen on a band surrounding S, with d = 2 and d = 3. The point of this paper is to show that, if S belongs to a wide class of compact sets (which we call sets with polynomial volume), the proposed statistical model leads to a relatively simple parametric formulation. In this setup, standard methodologies (method of moments, maximum likelihood) can be used to estimate some interesting geometric parameters, including curvatures and the Euler characteristic. We will particularly focus on the estimation of the (d−1)-dimensional boundary measure (in Minkowski's sense) of S. It turns out, however, that the estimation problem is not straightforward, since the standard estimators show a remarkably pathological behavior: while they are consistent and asymptotically normal, their expectations are infinite. The theoretical and practical consequences of this fact are discussed in some detail. Comment: Statistics: A Journal of Theoretical and Applied Statistics, 201
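    To fix ideas about the quantity being estimated (not the paper's parametric method), the Minkowski boundary measure of the unit circle in R² can be approximated by direct Monte Carlo: the measure is the limit of area({x : dist(x, S) ≤ s}) / (2s), which for the circle equals 2π. All constants below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
r, s, n = 1.0, 0.05, 200_000

# Monte Carlo estimate of the Minkowski boundary measure of the unit
# circle S: length(S) ~ area({x : dist(x, S) <= s}) / (2 * s).
box = 1.5                                 # sampling box [-box, box]^2 contains the band
pts = rng.uniform(-box, box, size=(n, 2))
dist_to_S = np.abs(np.linalg.norm(pts, axis=1) - r)
band_area = (2 * box) ** 2 * np.mean(dist_to_S <= s)
length_est = band_area / (2 * s)          # should be close to 2 * pi
```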

    On functional logistic regression: some conceptual issues

    The main ideas behind the classic multivariate logistic regression model make sense when translated to the functional setting, where the explanatory variable X is a function and the response Y is binary. However, some important technical issues appear (or are aggravated with respect to those of the multivariate case) due to the functional nature of the explanatory variable. First, the mere definition of the model can be questioned: while most approaches proposed so far rely on the L2-based model, we explore an alternative (in some sense, more general) approach based on the theory of reproducing kernel Hilbert spaces (RKHS). The validity conditions of such an RKHS-based model, and their relation with the L2-based one, are investigated and made explicit in two formal results. Some relevant particular cases are considered as well. Second, we show that, under very general conditions, the maximum likelihood estimator of the logistic model parameters fails to exist in the functional case, although some restricted versions can be considered. Third, we check (in the framework of binary classification) the practical performance of some RKHS-based procedures well-suited to our model: they are compared to several competing methods via Monte Carlo experiments and the analysis of real data sets. This work has been partially supported by Spanish Grant PID2019-109387GB-I0
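    One restricted version in which maximum likelihood is well behaved is logistic regression on finitely many marginals X(t_j). The sketch below simulates such a model and fits it by Newton-Raphson; the process, grid, marginal points, and coefficients are illustrative assumptions, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(4)
t = np.linspace(0.0, 1.0, 100)
n, dt = 400, t[1] - t[0]
X = np.cumsum(rng.standard_normal((n, len(t))) * np.sqrt(dt), axis=1)

# Binary response driven by two marginals of the trajectory (hypothetical).
j1, j2 = 25, 75
eta = 1.5 * X[:, j1] - 1.0 * X[:, j2]
Y = (rng.uniform(size=n) < 1.0 / (1.0 + np.exp(-eta))).astype(float)

# Newton-Raphson logistic fit on intercept + the two selected marginals;
# restricting to finitely many variables keeps the likelihood bounded here.
Z = np.column_stack([np.ones(n), X[:, j1], X[:, j2]])
b = np.zeros(3)
for _ in range(25):
    p = 1.0 / (1.0 + np.exp(-Z @ b))
    W = p * (1.0 - p)
    b += np.linalg.solve(Z.T @ (Z * W[:, None]), Z.T @ (Y - p))
```

    With data this noisy there is no complete separation, so the iteration converges; in the full functional case the paper shows the (unrestricted) MLE fails to exist.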

    Uniform strong consistency of robust estimators

    In the robustness framework, the distribution underlying the data is not totally specified and, therefore, it is convenient to use estimators whose properties hold uniformly over the whole set of possible distributions. In this paper, we give two general results on uniform strong consistency and apply them to study the uniform consistency of some classes of robust estimators over contamination neighborhoods. Some instances covered by our results are Huber's M-estimators, quantiles, and generalized S-estimators. Keywords: uniform strong consistency; robustness; M-estimators; GS-estimators
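    A sketch of one of the instances mentioned, Huber's M-estimate of location, computed by the standard iteratively reweighted averaging scheme; the tuning constant, sample size, and contamination level below are illustrative choices.

```python
import numpy as np

def huber_location(x, c=1.345, iters=50):
    """Huber M-estimate of location via iteratively reweighted averaging."""
    mu = np.median(x)                                   # robust starting point
    for _ in range(iters):
        r = np.abs(x - mu)
        w = np.minimum(1.0, c / np.maximum(r, 1e-12))   # Huber weights
        mu = np.sum(w * x) / np.sum(w)
    return mu

# 10% contamination at 50: the M-estimate stays near the true center 0,
# while the sample mean is dragged far away.
rng = np.random.default_rng(5)
n = 2000
x = np.concatenate([rng.standard_normal(int(0.9 * n)),
                    np.full(int(0.1 * n), 50.0)])
```

    On samples like this one, `huber_location(x)` remains within a few hundredths of 0, whereas the sample mean is around 5, illustrating the stability over a contamination neighborhood that the consistency results concern.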